Dataset statistics
| Number of variables | 13 |
|---|---|
| Number of observations | 824146 |
| Missing cells | 1304368 |
| Missing cells (%) | 12.2% |
| Duplicate rows | 0 |
| Duplicate rows (%) | 0.0% |
| Total size in memory | 81.7 MiB |
| Average record size in memory | 104.0 B |
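The missing-cell figures above can be cross-checked against the per-column missing counts listed in the warnings of this report. The following is a minimal sketch in plain Python; the variable names are illustrative only, and the numbers are copied from the report rather than recomputed from the data.

```python
# Figures copied from the report; nothing here reads the original CSV.
n_rows = 824_146
n_cols = 13

# Per-column missing counts, taken from the "Missing" warnings.
missing_by_column = {
    "# Days Absent": 215_727,
    "# Days Present": 215_727,
    "% Attendance": 215_727,
    "# Contributing 20+ Total Days": 215_727,
    "# Chronically Absent": 220_730,
    "% Chronically Absent": 220_730,
}

total_cells = n_rows * n_cols
missing_cells = sum(missing_by_column.values())
missing_pct = 100 * missing_cells / total_cells

print(missing_cells)          # 1304368
print(f"{missing_pct:.1f}%")  # 12.2%
```

Both results agree with the "Missing cells" rows, which suggests that the six listed columns account for all missing values.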
Variable types
| NUM | 7 |
|---|---|
| CAT | 6 |
Reproduction
| Analysis started | 2020-12-13 17:28:52.343296 |
|---|---|
| Analysis finished | 2020-12-13 17:29:56.590065 |
| Duration | 1 minute and 4.25 seconds |
| Version | pandas-profiling v2.8.0 |
| Command line | pandas_profiling --config_file config.yaml [YOUR_FILE.csv] |
| Download configuration | config.yaml |
| Warning | Type |
|---|---|
| DBN has a high cardinality: 1631 distinct values | High cardinality |
| School Name has a high cardinality: 1627 distinct values | High cardinality |
| # Days Present is highly correlated with # Total Days and 1 other fields | High correlation |
| # Total Days is highly correlated with # Days Present and 1 other fields | High correlation |
| # Contributing 20+ Total Days is highly correlated with # Total Days and 1 other fields | High correlation |
| # Chronically Absent is highly correlated with # Days Absent | High correlation |
| # Days Absent is highly correlated with # Chronically Absent | High correlation |
| Demographic Variable is highly correlated with Demographic Category | High correlation |
| Demographic Category is highly correlated with Demographic Variable | High correlation |
| # Days Absent has 215727 (26.2%) missing values | Missing |
| # Days Present has 215727 (26.2%) missing values | Missing |
| % Attendance has 215727 (26.2%) missing values | Missing |
| # Contributing 20+ Total Days has 215727 (26.2%) missing values | Missing |
| # Chronically Absent has 220730 (26.8%) missing values | Missing |
| % Chronically Absent has 220730 (26.8%) missing values | Missing |
| # Chronically Absent has 18887 (2.3%) zeros | Zeros |
| % Chronically Absent has 18887 (2.3%) zeros | Zeros |
DBN
Categorical
| Distinct count | 1631 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.3 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 31R080 | 1168 | 0.1% |
| 01M539 | 1117 | 0.1% |
| 75Q993 | 1086 | 0.1% |
| 75X188 | 1071 | 0.1% |
| 75K369 | 1057 | 0.1% |
| 75K053 | 1055 | 0.1% |
| 75M138 | 1048 | 0.1% |
| 75Q811 | 1044 | 0.1% |
| 75M226 | 1040 | 0.1% |
| 75Q177 | 1033 | 0.1% |
| Other values (1621) | 813427 | 98.7% |
Length
| Max length | 6 |
|---|---|
| Median length | 6 |
| Mean length | 6 |
| Min length | 6 |
School Name
Categorical
| Distinct count | 1627 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.3 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| P.S. 212 | 1352 | 0.2% |
| P.S. 253 | 1244 | 0.2% |
| The Michael J. Petrides School | 1168 | 0.1% |
| New Explorations into Science, Technology and Math | 1117 | 0.1% |
| P.S. Q993 | 1086 | 0.1% |
| P.S. X188 | 1071 | 0.1% |
| P.S. K369 - Coy L. Cox School | 1057 | 0.1% |
| P.S. K053 | 1055 | 0.1% |
| P.S. 138 | 1048 | 0.1% |
| P.S. Q811 | 1044 | 0.1% |
| Other values (1617) | 812904 | 98.6% |
Length
| Max length | 50 |
|---|---|
| Median length | 26 |
| Mean length | 27.13571018 |
| Min length | 5 |
Grade
Categorical
| Distinct count | 15 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.3 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| All Grades | 129688 | 15.7% |
| 0K | 65516 | 7.9% |
| 1 | 65201 | 7.9% |
| 2 | 64559 | 7.8% |
| 3 | 63429 | 7.7% |
| 4 | 62178 | 7.5% |
| 5 | 61850 | 7.5% |
| 6 | 45273 | 5.5% |
| 8 | 41625 | 5.1% |
| 7 | 41606 | 5.0% |
| Other values (5) | 183221 | 22.2% |
Length
| Max length | 18 |
|---|---|
| Median length | 1 |
| Mean length | 3.305443453 |
| Min length | 1 |
Year
Categorical
| Distinct count | 6 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.3 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 2015-16 | 138124 | 16.8% |
| 2016-17 | 138093 | 16.8% |
| 2017-18 | 137873 | 16.7% |
| 2018-19 | 137532 | 16.7% |
| 2014-15 | 137033 | 16.6% |
| 2013-14 | 135491 | 16.4% |
Length
| Max length | 7 |
|---|---|
| Median length | 7 |
| Mean length | 7 |
| Min length | 7 |
Demographic Category
Categorical
| Distinct count | 6 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.3 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| Ethnicity | 274610 | 33.3% |
| Gender | 126575 | 15.4% |
| Poverty | 125940 | 15.3% |
| SWD Status | 116918 | 14.2% |
| ELL Status | 115891 | 14.1% |
| All Students | 64212 | 7.8% |
Length
| Max length | 12 |
|---|---|
| Median length | 9 |
| Mean length | 8.749850148 |
| Min length | 6 |
Demographic Variable
Categorical
| Distinct count | 14 |
|---|---|
| Unique (%) | < 0.1% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Memory size | 6.3 MiB |
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| All Students | 64212 | 7.8% |
| Poverty | 63998 | 7.8% |
| Male | 63353 | 7.7% |
| Hispanic | 63311 | 7.7% |
| Female | 63222 | 7.7% |
| Not Poverty | 61942 | 7.5% |
| Black | 60755 | 7.4% |
| Not ELL | 60174 | 7.3% |
| SWD | 59524 | 7.2% |
| Not SWD | 57394 | 7.0% |
| Other values (4) | 206261 | 25.0% |
Length
| Max length | 12 |
|---|---|
| Median length | 6 |
| Mean length | 6.387603653 |
| Min length | 3 |
# Total Days
Numeric
| Distinct count | 86622 |
|---|---|
| Unique (%) | 10.5% |
| Missing | 0 |
| Missing (%) | 0.0% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 14454.985155542829 |
| Minimum | 1 |
| Maximum | 976375 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 6.3 MiB |
Quantile statistics
| Minimum | 1 |
|---|---|
| 5-th percentile | 178 |
| Q1 | 1849 |
| median | 6224 |
| Q3 | 14270 |
| 95-th percentile | 60832.75 |
| Maximum | 976375 |
| Range | 976374 |
| Interquartile range (IQR) | 12421 |
Descriptive statistics
| Standard deviation | 29094.9427 |
|---|---|
| Coefficient of variation (CV) | 2.012796442 |
| Kurtosis | 96.75291348 |
| Mean | 14454.98516 |
| Median Absolute Deviation (MAD) | 5173 |
| Skewness | 7.175055511 |
| Sum | 1.19130182e+10 |
| Variance | 846515690.5 |
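Several of the statistics above are simple functions of one another. As a quick consistency check, the sketch below recomputes the range, IQR, coefficient of variation, variance, and sum from the other reported values. It uses only figures copied from the tables, so small differences from rounding in the report are expected; the variable names are illustrative.

```python
import math

# Values copied from the quantile and descriptive tables above.
minimum, maximum = 1, 976375
q1, q3 = 1849, 14270
mean = 14454.98516
std = 29094.9427
n_observations = 824146  # this variable has no missing values

value_range = maximum - minimum   # Range = Maximum - Minimum
iqr = q3 - q1                     # IQR = Q3 - Q1
cv = std / mean                   # CV = standard deviation / mean
variance = std ** 2               # Variance = standard deviation squared
total = mean * n_observations     # Sum = mean * number of observations

print(value_range)  # 976374
print(iqr)          # 12421
print(math.isclose(cv, 2.012796442, rel_tol=1e-6))
print(math.isclose(variance, 846515690.5, rel_tol=1e-6))
print(math.isclose(total, 1.19130182e10, rel_tol=1e-6))
```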
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 178 | 10802 | 1.3% |
| 356 | 6691 | 0.8% |
| 176 | 4461 | 0.5% |
| 534 | 4336 | 0.5% |
| 182 | 3423 | 0.4% |
| 180 | 3211 | 0.4% |
| 712 | 2963 | 0.4% |
| 352 | 2573 | 0.3% |
| 890 | 2228 | 0.3% |
| 364 | 1987 | 0.2% |
| Other values (86612) | 781471 | 94.8% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1 | 1321 | 0.2% |
| 2 | 1085 | 0.1% |
| 3 | 727 | 0.1% |
| 4 | 561 | 0.1% |
| 5 | 558 | 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 976375 | 1 | < 0.1% |
| 976210 | 1 | < 0.1% |
| 963768 | 1 | < 0.1% |
| 954227 | 2 | < 0.1% |
| 942934 | 1 | < 0.1% |
# Days Absent
Numeric
| Distinct count | 16802 |
|---|---|
| Unique (%) | 2.8% |
| Missing | 215727 |
| Missing (%) | 26.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 1491.2436002162983 |
| Minimum | 0.0 |
| Maximum | 105055.0 |
| Zeros | 7 |
| Zeros (%) | < 0.1% |
| Memory size | 6.3 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 96 |
| Q1 | 298 |
| median | 646 |
| Q3 | 1423 |
| 95-th percentile | 6003 |
| Maximum | 105055 |
| Range | 105055 |
| Interquartile range (IQR) | 1125 |
Descriptive statistics
| Standard deviation | 2864.866744 |
|---|---|
| Coefficient of variation (CV) | 1.921125927 |
| Kurtosis | 118.2651455 |
| Mean | 1491.2436 |
| Median Absolute Deviation (MAD) | 430 |
| Skewness | 7.952174774 |
| Sum | 907300940 |
| Variance | 8207461.462 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 158 | 687 | 0.1% |
| 126 | 683 | 0.1% |
| 115 | 667 | 0.1% |
| 121 | 665 | 0.1% |
| 141 | 661 | 0.1% |
| 185 | 659 | 0.1% |
| 174 | 656 | 0.1% |
| 133 | 655 | 0.1% |
| 156 | 654 | 0.1% |
| 201 | 653 | 0.1% |
| Other values (16792) | 601779 | 73.0% |
| (Missing) | 215727 | 26.2% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 7 | < 0.1% |
| 2 | 1 | < 0.1% |
| 3 | 5 | < 0.1% |
| 4 | 7 | < 0.1% |
| 5 | 7 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 105055 | 1 | < 0.1% |
| 101087 | 1 | < 0.1% |
| 86903 | 1 | < 0.1% |
| 86895 | 1 | < 0.1% |
| 86513 | 1 | < 0.1% |
# Days Present
Numeric
| Distinct count | 81134 |
|---|---|
| Unique (%) | 13.3% |
| Missing | 215727 |
| Missing (%) | 26.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 16858.884162065944 |
| Minimum | 8.0 |
| Maximum | 934266.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 6.3 MiB |
Quantile statistics
| Minimum | 8 |
|---|---|
| 5-th percentile | 1350 |
| Q1 | 3670 |
| median | 8009 |
| Q3 | 16020 |
| 95-th percentile | 66871.2 |
| Maximum | 934266 |
| Range | 934258 |
| Interquartile range (IQR) | 12350 |
Descriptive statistics
| Standard deviation | 30011.05309 |
|---|---|
| Coefficient of variation (CV) | 1.780132825 |
| Kurtosis | 76.52416301 |
| Mean | 16858.88416 |
| Median Absolute Deviation (MAD) | 5145 |
| Skewness | 6.436099518 |
| Sum | 1.025726544e+10 |
| Variance | 900663307.4 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 1020 | 92 | < 0.1% |
| 997 | 89 | < 0.1% |
| 1525 | 87 | < 0.1% |
| 1032 | 86 | < 0.1% |
| 1209 | 85 | < 0.1% |
| 1039 | 84 | < 0.1% |
| 1196 | 82 | < 0.1% |
| 1198 | 82 | < 0.1% |
| 1184 | 82 | < 0.1% |
| 1013 | 82 | < 0.1% |
| Other values (81124) | 607568 | 73.7% |
| (Missing) | 215727 | 26.2% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 8 | 1 | < 0.1% |
| 21 | 1 | < 0.1% |
| 95 | 1 | < 0.1% |
| 132 | 1 | < 0.1% |
| 159 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 934266 | 1 | < 0.1% |
| 922543 | 1 | < 0.1% |
| 915010 | 2 | < 0.1% |
| 904750 | 1 | < 0.1% |
| 886967 | 1 | < 0.1% |
% Attendance
Numeric
| Distinct count | 665 |
|---|---|
| Unique (%) | 0.1% |
| Missing | 215727 |
| Missing (%) | 26.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 91.29663751460751 |
| Minimum | 0.7 |
| Maximum | 100.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 6.3 MiB |
Quantile statistics
| Minimum | 0.7 |
|---|---|
| 5-th percentile | 81.1 |
| Q1 | 89.4 |
| median | 92.6 |
| Q3 | 94.7 |
| 95-th percentile | 96.9 |
| Maximum | 100 |
| Range | 99.3 |
| Interquartile range (IQR) | 5.3 |
Descriptive statistics
| Standard deviation | 5.333850844 |
|---|---|
| Coefficient of variation (CV) | 0.05842330002 |
| Kurtosis | 10.98365427 |
| Mean | 91.29663751 |
| Median Absolute Deviation (MAD) | 2.5 |
| Skewness | -2.352753332 |
| Sum | 55546608.9 |
| Variance | 28.44996483 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 94.1 | 7602 | 0.9% |
| 94.3 | 7416 | 0.9% |
| 93.9 | 7411 | 0.9% |
| 94.5 | 7394 | 0.9% |
| 94 | 7389 | 0.9% |
| 93.8 | 7386 | 0.9% |
| 94.6 | 7350 | 0.9% |
| 94.2 | 7328 | 0.9% |
| 93.6 | 7303 | 0.9% |
| 94.4 | 7282 | 0.9% |
| Other values (655) | 534558 | 64.9% |
| (Missing) | 215727 | 26.2% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0.7 | 1 | < 0.1% |
| 1.7 | 1 | < 0.1% |
| 9.9 | 1 | < 0.1% |
| 12 | 1 | < 0.1% |
| 12.1 | 1 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 100 | 8 | < 0.1% |
| 99.9 | 3 | < 0.1% |
| 99.8 | 9 | < 0.1% |
| 99.7 | 15 | < 0.1% |
| 99.6 | 21 | < 0.1% |
# Contributing 20+ Total Days
Numeric
| Distinct count | 2440 |
|---|---|
| Unique (%) | 0.4% |
| Missing | 215727 |
| Missing (%) | 26.2% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 110.23824371033778 |
| Minimum | 5.0 |
| Maximum | 5940.0 |
| Zeros | 0 |
| Zeros (%) | 0.0% |
| Memory size | 6.3 MiB |
Quantile statistics
| Minimum | 5 |
|---|---|
| 5-th percentile | 9 |
| Q1 | 24 |
| median | 53 |
| Q3 | 105 |
| 95-th percentile | 439 |
| Maximum | 5940 |
| Range | 5935 |
| Interquartile range (IQR) | 81 |
Descriptive statistics
| Standard deviation | 195.7382183 |
|---|---|
| Coefficient of variation (CV) | 1.775592678 |
| Kurtosis | 82.52221113 |
| Mean | 110.2382437 |
| Median Absolute Deviation (MAD) | 34 |
| Skewness | 6.678971413 |
| Sum | 67071042 |
| Variance | 38313.45012 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 8 | 8683 | 1.1% |
| 11 | 8677 | 1.1% |
| 9 | 8640 | 1.0% |
| 10 | 8554 | 1.0% |
| 13 | 8485 | 1.0% |
| 7 | 8454 | 1.0% |
| 12 | 8396 | 1.0% |
| 14 | 8198 | 1.0% |
| 18 | 8157 | 1.0% |
| 16 | 8085 | 1.0% |
| Other values (2430) | 524090 | 63.6% |
| (Missing) | 215727 | 26.2% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5 | 750 | 0.1% |
| 6 | 7259 | 0.9% |
| 7 | 8454 | 1.0% |
| 8 | 8683 | 1.1% |
| 9 | 8640 | 1.0% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 5940 | 1 | < 0.1% |
| 5863 | 1 | < 0.1% |
| 5846 | 2 | < 0.1% |
| 5774 | 1 | < 0.1% |
| 5668 | 1 | < 0.1% |
# Chronically Absent
Numeric
| Distinct count | 934 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 220730 |
| Missing (%) | 26.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 27.967415514338366 |
| Minimum | 0.0 |
| Maximum | 1596.0 |
| Zeros | 18887 |
| Zeros (%) | 2.3% |
| Memory size | 6.3 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 1 |
| Q1 | 5 |
| median | 12 |
| Q3 | 28 |
| 95-th percentile | 114 |
| Maximum | 1596 |
| Range | 1596 |
| Interquartile range (IQR) | 23 |
Descriptive statistics
| Standard deviation | 52.3267896 |
|---|---|
| Coefficient of variation (CV) | 1.870991246 |
| Kurtosis | 84.97170772 |
| Mean | 27.96741551 |
| Median Absolute Deviation (MAD) | 9 |
| Skewness | 6.748101366 |
| Sum | 16875986 |
| Variance | 2738.09291 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 3 | 31437 | 3.8% |
| 2 | 30832 | 3.7% |
| 4 | 29434 | 3.6% |
| 1 | 27914 | 3.4% |
| 5 | 27716 | 3.4% |
| 6 | 25699 | 3.1% |
| 7 | 23851 | 2.9% |
| 8 | 21820 | 2.6% |
| 9 | 20284 | 2.5% |
| 0 | 18887 | 2.3% |
| Other values (924) | 345542 | 41.9% |
| (Missing) | 220730 | 26.8% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 18887 | 2.3% |
| 1 | 27914 | 3.4% |
| 2 | 30832 | 3.7% |
| 3 | 31437 | 3.8% |
| 4 | 29434 | 3.6% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 1596 | 1 | < 0.1% |
| 1586 | 1 | < 0.1% |
| 1576 | 1 | < 0.1% |
| 1465 | 1 | < 0.1% |
| 1458 | 1 | < 0.1% |
% Chronically Absent
Numeric
| Distinct count | 984 |
|---|---|
| Unique (%) | 0.2% |
| Missing | 220730 |
| Missing (%) | 26.8% |
| Infinite | 0 |
| Infinite (%) | 0.0% |
| Mean | 27.720951880626302 |
| Minimum | 0.0 |
| Maximum | 100.0 |
| Zeros | 18887 |
| Zeros (%) | 2.3% |
| Memory size | 6.3 MiB |
Quantile statistics
| Minimum | 0 |
|---|---|
| 5-th percentile | 3.1 |
| Q1 | 13.5 |
| median | 25 |
| Q3 | 39.4 |
| 95-th percentile | 61.3 |
| Maximum | 100 |
| Range | 100 |
| Interquartile range (IQR) | 25.9 |
Descriptive statistics
| Standard deviation | 18.06207498 |
|---|---|
| Coefficient of variation (CV) | 0.6515676324 |
| Kurtosis | 0.04975180387 |
| Mean | 27.72095188 |
| Median Absolute Deviation (MAD) | 12.5 |
| Skewness | 0.6709151033 |
| Sum | 16727265.9 |
| Variance | 326.2385527 |
Histogram with fixed size bins (bins=10)
Common values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 18887 | 2.3% |
| 33.3 | 10646 | 1.3% |
| 50 | 9534 | 1.2% |
| 25 | 8414 | 1.0% |
| 16.7 | 7247 | 0.9% |
| 20 | 6648 | 0.8% |
| 14.3 | 6222 | 0.8% |
| 28.6 | 5454 | 0.7% |
| 12.5 | 5211 | 0.6% |
| 40 | 4670 | 0.6% |
| Other values (974) | 520483 | 63.2% |
| (Missing) | 220730 | 26.8% |
Minimum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 0 | 18887 | 2.3% |
| 0.2 | 1 | < 0.1% |
| 0.3 | 11 | < 0.1% |
| 0.4 | 41 | < 0.1% |
| 0.5 | 44 | < 0.1% |
Maximum 5 values
| Value | Count | Frequency (%) |
|---|---|---|
| 100 | 336 | < 0.1% |
| 99 | 1 | < 0.1% |
| 98.9 | 1 | < 0.1% |
| 98.6 | 1 | < 0.1% |
| 98.5 | 2 | < 0.1% |
Pearson's r
Pearson's correlation coefficient (r) is a measure of linear correlation between two variables. Its value lies between -1 and +1: -1 indicates total negative linear correlation, 0 indicates no linear correlation, and +1 indicates total positive linear correlation. Furthermore, r is invariant under separate changes in location and scale of the two variables, which implies that for a linear function the angle to the x-axis does not affect r. To calculate r for two variables X and Y, one divides the covariance of X and Y by the product of their standard deviations.
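The calculation described above can be sketched in a few lines of plain Python. The function name `pearson_r` is hypothetical, and the sample values are the first five `# Total Days` and `# Days Present` entries shown in the first rows of this report; this is an illustration of the formula, not the code the profiling tool runs.

```python
import math

def pearson_r(xs, ys):
    """Covariance of xs and ys divided by the product of their standard deviations."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # The 1/n factors in the covariance and both variances cancel, so raw sums suffice.
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

print(pearson_r([1, 2, 3], [2, 4, 6]))  # 1.0
print(pearson_r([1, 2, 3], [6, 4, 2]))  # -1.0

# First five rows of the dataset (All Grades, All Students, 2013-14 to 2017-18).
total_days = [34803, 33455, 29840, 30601, 33264]
days_present = [32020.0, 31081.0, 27769.0, 28607.0, 31186.0]
r_sample = pearson_r(total_days, days_present)
print(r_sample)  # close to +1

# Shifting and rescaling one variable (positive scale) leaves r unchanged.
r_rescaled = pearson_r([2 * x + 5 for x in total_days], days_present)
print(abs(r_rescaled - r_sample) < 1e-9)
```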
Spearman's ρ
Spearman's rank correlation coefficient (ρ) is a measure of monotonic correlation between two variables, and is therefore better at catching nonlinear monotonic correlations than Pearson's r. Its value lies between -1 and +1: -1 indicates total negative monotonic correlation, 0 indicates no monotonic correlation, and +1 indicates total positive monotonic correlation. To calculate ρ for two variables X and Y, one divides the covariance of the rank variables of X and Y by the product of their standard deviations.
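As a hedged illustration of the rank-based calculation above, the sketch below converts each variable to ranks (averaging ranks for ties, which is one common convention and an assumption here) and then applies the Pearson formula to the ranks. The helper names `average_ranks`, `pearson_r`, and `spearman_rho` are hypothetical.

```python
import math

def pearson_r(xs, ys):
    """Covariance divided by the product of standard deviations."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def average_ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to cover the run of values equal to values[order[i]].
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        shared_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = shared_rank
        i = j + 1
    return ranks

def spearman_rho(xs, ys):
    """Pearson's r computed on the rank variables."""
    return pearson_r(average_ranks(xs), average_ranks(ys))

print(average_ranks([10, 20, 20, 30]))  # [1.0, 2.5, 2.5, 4.0]

# A monotonic but nonlinear relationship: rho is exactly 1, Pearson's r is not.
xs = [1, 2, 3, 4, 5]
ys = [x ** 3 for x in xs]
print(spearman_rho(xs, ys))  # 1.0
print(pearson_r(xs, ys) < 1.0)  # True
```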
Kendall's τ
Similarly to Spearman's rank correlation coefficient, the Kendall rank correlation coefficient (τ) measures ordinal association between two variables. Its value lies between -1 and +1: -1 indicates total negative correlation, 0 indicates no correlation, and +1 indicates total positive correlation. To calculate τ for two variables X and Y, one determines the number of concordant and discordant pairs of observations. τ is given by the number of concordant pairs minus the number of discordant pairs, divided by the total number of pairs.
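The pair-counting definition above corresponds to the variant usually called tau-a. A minimal sketch follows, with the hypothetical name `kendall_tau_a`; tied pairs count as neither concordant nor discordant, and libraries commonly compute the tie-adjusted tau-b variant instead, so results can differ when ties are present.

```python
from itertools import combinations

def kendall_tau_a(xs, ys):
    """(concordant pairs - discordant pairs) / total number of pairs."""
    pairs = list(combinations(zip(xs, ys), 2))
    concordant = discordant = 0
    for (x1, y1), (x2, y2) in pairs:
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1   # both variables move in the same direction
        elif direction < 0:
            discordant += 1   # the variables move in opposite directions
    return (concordant - discordant) / len(pairs)

print(kendall_tau_a([1, 2, 3], [1, 2, 3]))  # 1.0
print(kendall_tau_a([1, 2, 3], [3, 2, 1]))  # -1.0

# Six pairs in total: five concordant, one discordant, so tau = 4/6.
tau = kendall_tau_a([1, 2, 3, 4], [1, 3, 2, 4])
print(round(tau, 3))  # 0.667
```

The quadratic pair loop is fine for illustration but would be slow on a dataset of this size.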
Phik (φk)
Phik (φk) is a new and practical correlation coefficient that works consistently between categorical, ordinal and interval variables, captures non-linear dependency, and reverts to the Pearson correlation coefficient in the case of a bivariate normal input distribution. Extensive documentation is available from the phik project.
Cramér's V (φc)
Cramér's V is an association measure for nominal random variables. The coefficient ranges from 0 to 1, with 0 indicating independence and 1 indicating perfect association. The empirical estimators used for Cramér's V have been shown to be biased, even for large samples. This report uses the bias-corrected measure proposed by Bergsma in 2013.
First rows
| | DBN | School Name | Grade | Year | Demographic Category | Demographic Variable | # Total Days | # Days Absent | # Days Present | % Attendance | # Contributing 20+ Total Days | # Chronically Absent | % Chronically Absent |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 01M015 | P.S. 015 Roberto Clemente | All Grades | 2013-14 | All Students | All Students | 34803 | 2783.0 | 32020.0 | 92.0 | 216.0 | 58.0 | 26.9 |
| 1 | 01M015 | P.S. 015 Roberto Clemente | All Grades | 2014-15 | All Students | All Students | 33455 | 2374.0 | 31081.0 | 92.9 | 197.0 | 46.0 | 23.4 |
| 2 | 01M015 | P.S. 015 Roberto Clemente | All Grades | 2015-16 | All Students | All Students | 29840 | 2071.0 | 27769.0 | 93.1 | 186.0 | 51.0 | 27.4 |
| 3 | 01M015 | P.S. 015 Roberto Clemente | All Grades | 2016-17 | All Students | All Students | 30601 | 1994.0 | 28607.0 | 93.5 | 193.0 | 48.0 | 24.9 |
| 4 | 01M015 | P.S. 015 Roberto Clemente | All Grades | 2017-18 | All Students | All Students | 33264 | 2078.0 | 31186.0 | 93.8 | 195.0 | 37.0 | 19.0 |
| 5 | 01M015 | P.S. 015 Roberto Clemente | All Grades | 2018-19 | All Students | All Students | 30887 | 2278.0 | 28609.0 | 92.6 | 186.0 | 45.0 | 24.2 |
| 6 | 01M015 | P.S. 015 Roberto Clemente | PK in K-12 Schools | 2013-14 | All Students | All Students | 4711 | 560.0 | 4151.0 | 88.1 | 30.0 | 16.0 | 53.3 |
| 7 | 01M015 | P.S. 015 Roberto Clemente | PK in K-12 Schools | 2014-15 | All Students | All Students | 3395 | 484.0 | 2911.0 | 85.7 | 23.0 | 15.0 | 65.2 |
| 8 | 01M015 | P.S. 015 Roberto Clemente | PK in K-12 Schools | 2015-16 | All Students | All Students | 2193 | 248.0 | 1945.0 | 88.7 | 18.0 | 9.0 | 50.0 |
| 9 | 01M015 | P.S. 015 Roberto Clemente | PK in K-12 Schools | 2016-17 | All Students | All Students | 2844 | NaN | NaN | NaN | NaN | NaN | NaN |
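The rows above hint at why several numeric columns are flagged as highly correlated: some columns look like simple functions of others. The sketch below checks three such relationships on the first three rows. These relationships are an observation from the sample rows, not a definition confirmed by the data source, and the helper name `check_row` is hypothetical.

```python
def check_row(total, absent, present, pct_attendance,
              contributing, chronic, pct_chronic):
    """Return one boolean per relationship that appears to hold in the sample."""
    return (
        # Days absent plus days present equals total days.
        absent + present == total,
        # % Attendance is days present over total days, to one decimal.
        round(100 * present / total, 1) == pct_attendance,
        # % Chronically Absent is chronically absent over contributing students.
        round(100 * chronic / contributing, 1) == pct_chronic,
    )

# Rows 0 to 2 of the table above, numeric columns only.
rows = [
    (34803, 2783.0, 32020.0, 92.0, 216.0, 58.0, 26.9),
    (33455, 2374.0, 31081.0, 92.9, 197.0, 46.0, 23.4),
    (29840, 2071.0, 27769.0, 93.1, 186.0, 51.0, 27.4),
]

results = [check_row(*row) for row in rows]
print(all(all(checks) for checks in results))  # True
```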
Last rows
| | DBN | School Name | Grade | Year | Demographic Category | Demographic Variable | # Total Days | # Days Absent | # Days Present | % Attendance | # Contributing 20+ Total Days | # Chronically Absent | % Chronically Absent |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 824136 | 75X811 | P.S. X811 | 12 | 2014-15 | ELL Status | ELL | 11837 | 1953.0 | 9884.0 | 83.5 | 67.0 | 40.0 | 59.7 |
| 824137 | 75X811 | P.S. X811 | 12 | 2014-15 | ELL Status | Not ELL | 37472 | 6218.0 | 31254.0 | 83.4 | 213.0 | 122.0 | 57.3 |
| 824138 | 75X811 | P.S. X811 | 12 | 2015-16 | ELL Status | ELL | 16332 | 2606.0 | 13726.0 | 84.0 | 96.0 | 56.0 | 58.3 |
| 824139 | 75X811 | P.S. X811 | 12 | 2015-16 | ELL Status | Not ELL | 37639 | 5673.0 | 31966.0 | 84.9 | 222.0 | 115.0 | 51.8 |
| 824140 | 75X811 | P.S. X811 | 12 | 2016-17 | ELL Status | ELL | 22436 | 3809.0 | 18627.0 | 83.0 | 132.0 | 85.0 | 64.4 |
| 824141 | 75X811 | P.S. X811 | 12 | 2016-17 | ELL Status | Not ELL | 32558 | 5543.0 | 27015.0 | 83.0 | 192.0 | 98.0 | 51.0 |
| 824142 | 75X811 | P.S. X811 | 12 | 2017-18 | ELL Status | ELL | 22818 | 4079.0 | 18739.0 | 82.1 | 133.0 | 81.0 | 60.9 |
| 824143 | 75X811 | P.S. X811 | 12 | 2017-18 | ELL Status | Not ELL | 34542 | 5791.0 | 28751.0 | 83.2 | 201.0 | 112.0 | 55.7 |
| 824144 | 75X811 | P.S. X811 | 12 | 2018-19 | ELL Status | ELL | 24910 | 4837.0 | 20073.0 | 80.6 | 147.0 | 98.0 | 66.7 |
| 824145 | 75X811 | P.S. X811 | 12 | 2018-19 | ELL Status | Not ELL | 37488 | 5849.0 | 31639.0 | 84.4 | 215.0 | 102.0 | 47.4 |